Data aggregation at the level of molecular pathways improves stability of experimental transcriptomic and proteomic data.

Authors

  • Nicolas Borisov
  • Maria Suntsova
  • Maxim Sorokin
  • Andrew Garazha
  • Olga Kovalchuk
  • Alexander Aliper
  • Elena Ilnitskaya
  • Ksenia Lezhnina
  • Mikhail Korzinkin
  • Victor Tkachev
  • Vyacheslav Saenko
  • Yury Saenko
  • Dmitry G Sokov
  • Nurshat M Gaifullin
  • Kirill Kashintsev
  • Valery Shirokorad
  • Irina Shabalina
  • Alex Zhavoronkov
  • Bhubaneswar Mishra
  • Charles R Cantor
  • Anton Buzdin
Abstract

High-throughput technologies opened a new era in biomedicine by enabling massive analysis of gene expression at both the RNA and protein levels. Unfortunately, expression data obtained in different experiments are often poorly compatible, even for the same biologic samples. Here, using experimental and bioinformatic investigation of major experimental platforms, we show that aggregating gene expression data at the level of molecular pathways helps to diminish the cross- and intra-platform bias that is otherwise clearly seen at the level of individual genes. We created a mathematical model of cumulative suppression of data variation that predicts the ideal parameters and the optimal size of a molecular pathway. We compared five alternative methods by their ability to aggregate experimental molecular data and by their capacity to retain meaningful features of biologic samples. The bioinformatic method OncoFinder showed optimal performance in both tests and should be very useful for future cross-platform data analyses.
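
The variance-suppression idea in the abstract can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the OncoFinder algorithm and not the authors' mathematical model: pathway scores are plain means of gene values, the two "platforms" add independent Gaussian noise to the same underlying values, and every name and parameter (`simulate`, `genes_per_pathway`, `noise_sd`, the standard deviations) is hypothetical.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def simulate(n_pathways=200, genes_per_pathway=30, noise_sd=1.0, seed=0):
    """Return (gene-level, pathway-level) correlation between two platforms.

    Assumption: each platform reports the true gene value plus independent
    Gaussian noise, and a pathway score is the mean of its gene values.
    """
    rng = random.Random(seed)
    genes_a, genes_b, paths_a, paths_b = [], [], [], []
    for _ in range(n_pathways):
        shift = rng.gauss(0.0, 1.0)  # signal shared by the whole pathway
        truth = [shift + rng.gauss(0.0, 0.5) for _ in range(genes_per_pathway)]
        a = [t + rng.gauss(0.0, noise_sd) for t in truth]  # platform A readout
        b = [t + rng.gauss(0.0, noise_sd) for t in truth]  # platform B readout
        genes_a.extend(a)
        genes_b.extend(b)
        paths_a.append(mean(a))
        paths_b.append(mean(b))
    return pearson(genes_a, genes_b), pearson(paths_a, paths_b)


if __name__ == "__main__":
    gene_r, path_r = simulate()
    print(f"gene-level agreement:    r = {gene_r:.3f}")
    print(f"pathway-level agreement: r = {path_r:.3f}")
```

In this toy setting, averaging over the genes of a pathway shrinks the independent platform noise roughly in proportion to the square root of the pathway size, so the two platforms agree more closely on pathway scores than on individual genes. Real platform bias need not be independent or Gaussian, so the sketch shows only the direction of the effect.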

Journal:
  • Cell Cycle

Volume 16, Issue 19

Pages -

Publication date: 2017